NSF PAR Search | NSF Public Access Repository

Using artificial intelligence to model expert panel diagnosis of cholecystitis severity

https://doi.org/10.1007/s00464-025-12015-6

Olsen, Griffin H; Goodman, Emmett D; Aklilu, Josiah G; Bartoletti, Sebastiano; Hung, Kay S; Yang, Janice H; Sorenson, Eric C; Jopling, Jeffrey K; Yeung, Serena Y; Azagury, Dan E (October 2025, Surgical Endoscopy)

Full Text Available
ALGES: Active Learning with Gradient Embeddings for Semantic Segmentation of Laparoscopic Surgical Images

Aklilu, Josiah; Yeung, Serena (January 2022, Proceedings of Machine Learning for Healthcare)

Annotating medical images for the purposes of training computer vision models is an extremely laborious task that takes time and resources away from expert clinicians. Active learning (AL) is a machine learning paradigm that mitigates this problem by deliberately proposing data points that should be labeled in order to maximize model performance. We propose a novel AL algorithm for segmentation, ALGES, that utilizes gradient embeddings to effectively select laparoscopic images to be labeled by some external oracle while reducing annotation effort. Given any unlabeled image, our algorithm treats predicted segmentations as truth and computes gradients with respect to the model parameters of the last layer in a segmentation network. The norms of these per-pixel gradient vectors correspond to the magnitude of the induced change in model parameters and contain rich information about the model’s predictive uncertainty. Our algorithm then computes gradient embeddings in two ways, and we employ a center-finding algorithm with these embeddings to procure representative and diverse batches in each round of AL. An advantage of our approach is extensibility to any model architecture and differentiable loss scheme for semantic segmentation. We apply our approach to a public data set of laparoscopic cholecystectomy images and show that it outperforms current AL algorithms in selecting the most informative data points for improving the segmentation model. Our code is available at https://github.com/josaklil-ai/surg-active-learning.
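The selection idea in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see their repository for that). It assumes a softmax cross-entropy loss, a linear last layer, mean pooling of per-pixel gradients into one embedding per image, and greedy farthest-first selection standing in for the unspecified center-finding algorithm. The abstract does not say what its two embedding variants are, so only this one assumed variant is shown. All function names are hypothetical.

```python
import numpy as np

def pixel_gradient_embeddings(probs, feats):
    """Per-pixel gradients of cross-entropy w.r.t. a linear last layer.

    probs: (P, C) softmax outputs for P pixels and C classes.
    feats: (P, D) features fed into the last layer.
    The predicted class is treated as the label (pseudo-label), so the
    gradient for each pixel is the outer product (probs - onehot) x feats.
    """
    pseudo = np.eye(probs.shape[1])[probs.argmax(axis=1)]  # (P, C) one-hot
    residual = probs - pseudo                              # (P, C)
    grads = residual[:, :, None] * feats[:, None, :]       # (P, C, D)
    return grads.reshape(len(probs), -1)                   # (P, C*D)

def image_embedding(probs, feats):
    # Assumption: pool per-pixel gradients by averaging over pixels.
    return pixel_gradient_embeddings(probs, feats).mean(axis=0)

def greedy_k_center(emb, k):
    """Farthest-first selection over image embeddings of shape (N, E).

    Starts from the embedding with the largest norm (most uncertain under
    this sketch), then repeatedly adds the point farthest from the set
    already chosen, which favors diverse batches.
    """
    k = min(k, len(emb))
    selected = [int(np.linalg.norm(emb, axis=1).argmax())]
    dist = np.linalg.norm(emb - emb[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(dist.argmax())
        selected.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(emb - emb[nxt], axis=1))
    return selected
```

In this sketch a fully confident pixel yields a zero gradient, while an ambiguous pixel yields a gradient whose norm grows with both the prediction residual and the feature norm, which is why the norm acts as an uncertainty signal. In one round of AL, one would stack `image_embedding` outputs for all unlabeled images and pass them to `greedy_k_center` to choose the batch to annotate.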
Full Text Available
Domain Adaptive 3D Pose Augmentation for In-the-wild Human Mesh Recovery

https://doi.org/10.48550/arXiv.2206.10457

Weng, Zhenzhen; Wang, Kuan-Chieh; Kanazawa, Angjoo; Yeung, Serena (January 2022, International Conference on 3D Vision)

The ability to perceive 3D human bodies from a single image has a multitude of applications ranging from entertainment and robotics to neuroscience and healthcare. A fundamental challenge in human mesh recovery is in collecting the ground truth 3D mesh targets required for training, which requires burdensome motion capturing systems and is often limited to indoor laboratories. As a result, while progress is made on benchmark datasets collected in these restrictive settings, models fail to generalize to real-world "in-the-wild" scenarios due to distribution shifts. We propose Domain Adaptive 3D Pose Augmentation (DAPA), a data augmentation method that enhances the model's generalization ability in in-the-wild scenarios. DAPA combines the strength of methods based on synthetic datasets by getting direct supervision from the synthesized meshes, and domain adaptation methods by using ground truth 2D keypoints from the target dataset. We show quantitatively that finetuning with DAPA effectively improves results on benchmarks 3DPW and AGORA. We further demonstrate the utility of DAPA on a challenging dataset curated from videos of real-world parent-child interaction.
Full Text Available
Holistic 3D Human and Scene Mesh Estimation from Single View Images.

Weng, Zhenzhen; Yeung, Serena (January 2021, IEEE Xplore digital library)
The 3D world limits the human body pose and the human body pose conveys information about the surrounding objects. Indeed, from a single image of a person placed in an indoor scene, we as humans are adept at resolving ambiguities of the human pose and room layout through our knowledge of the physical laws and prior perception of the plausible object and human poses. However, few computer vision models fully leverage this fact. In this work, we propose a holistically trainable model that perceives the 3D scene from a single RGB image, estimates the camera pose and the room layout, and reconstructs both human body and object meshes. By imposing a set of comprehensive and sophisticated losses on all aspects of the estimations, we show that our model outperforms existing human body mesh methods and indoor scene reconstruction methods. To the best of our knowledge, this is the first model that outputs both object and human predictions at the mesh level, and performs joint optimization on the scene and human poses.
Full Text Available